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Abstract 

We present a survey of the cosmological applications of the next generation of 
weak lensing surveys, paying special attention to the computational challenges pre- 
sented by the number of galaxies, Ngal ~ 10 5 - We focus on optimal methods with 
no pixelization and derive a multigrid P 3 M algorithm that performs the relevant 
computations in 0(N ga i log N ga i) time. We test the algorithm by studying three 
applications of weak lensing surveys - convergence map reconstruction, cluster de- 
tection and E and B power spectrum estimation using realistic 1° x 1° simulations 
derived from N-body simulations. The map reconstruction is able to reconstruct 
large scale features without artifacts. Detecting clusters using only weak lensing is 
difficult because of line of sight contamination and noise, with low completeness 
if one desires low contamination of the sample. A power spectrum analysis of the 
convergence field is more promising and we are able to reconstruct the convergence 
spectrum with no loss of information down to the smallest scales. The numerical 
methods used here can be applied to other data sets with same 0(iV log N) scaling 
and can be generalised to a sphere. 
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PACS: need to be entered 



1 Introduction 



Mapping the matter distribution of the universe is one of the principal aims 
of cosmology. The traditional approach to this problem has been the use of 
galaxy surveys, the 2dF Galaxy Redshift Survey (Coliess et al. 2001) and the 
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Sloan Digital Sky Survey (York et al. 2000) being the most relevant examples 
today. However, galaxy surveys only map the luminous matter in the universe; 
generalising to all forms of matter requires the additional assumption that the 
luminous matter faithfully traces the total matter distribution. On the other 
hand, weak lensing, or the coherent distortion of the shapes of background 
galaxies by intervening matter, requires no such assumption and is emerging 
as a powerful tool to map the matter in the universe. 

The possibility of lensing by large scale structure (LSS) was first pointed 
out in pioneering work by Gunn (1967) and has since been theoretically and 
numerically studied by a number of authors (Blandford et al. 1991, Miralda- 
Escude 1991, Kaiser 1992, Bernardeau, van Waerbeke, & Mellier 1997, Kaiser 
1998, Jain & Seljak 1997, Wittman et al. 2000, Jain, Seljak, & White 2000, 
Wittman et al. 2000). However, detecting this "cosmic shear" had to wait for 
advances in imaging technology and has only recently become possible. There 
now are a number of detections (Bacon, Refregier, & Ellis 2000, Van Waerbeke 
et al. 2000, Rhodes, Refregier, & Groth 2001, Hoekstra et al. 2002) by vari- 
ous groups and with more observations in progress or planned, this number 
will continue to grow. The next generation of lensing observations such as the 
NOAO deep field, the CFHT legacy survey and the Deep Lens Survey will 
go beyond simply detecting cosmic shear but will map it over large areas of 
the sky. Such large scale surveys will herald in an era of precision cosmology 
for weak lensing. At the same time, these surveys bring with them the same 
computational challenges that current CMB experiments and galaxy surveys 
are facing. The number of background galaxies is N gal ~ 10 5 to 10 6 , mak- 
ing any brute force approach prohibitively expensive. Analysing these surveys 
therefore requires the development of algorithms that make the problem com- 
putationally tractable. Furthermore, the algorithms must be able to handle 
aspects of real data including noise, incomplete sampling, arbitrary cuts and 
so on. 

This work has two principal goals; the first is the development of an algorithm 
that solves the computational challenges of large N surveys, where N can be 
number of galaxies or just pixelized intensity (as for the CMB). We then test 
this algorithm (described in detail in the Appendix) on various applications of 
weak lensing surveys using simulated data. The paper therefore serves the dual 
purpose of being a test of our algorithm as well as a survey of the potential of 
weak lensing surveys in cosmology. 

We focus on three applications of weak lensing that we believe are the most 
interesting for cosmology. The first of these, reconstructing the matter dis- 
tribution, is the basic goal of large lensing surveys. Lensing measures the 
integrated line of sight matter distribution, or the convergence (k) map. Since 
this reconstruction includes all matter, luminous and dark, such a map gives 
us a glimpse of the true distribution of matter in the universe. Such maps 
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could also be combined with the high quality imaging data of the lensing sur- 
vey to understand the relationship between luminous and dark matter. Weak 
lensing reconstructions have been considered under two broad contexts, clus- 
ter mass reconstructions (Kaiser & Squires 1993, Kaiser et al. 1995, Seitz & 
Schneider 1995) and LSS reconstructions (Seljak 1998). Since the surveys we 
are considering in this paper are large field surveys, we will focus on methods 
for the latter, including effects of non uniform sampling, irregular boundaries 
and noise. 

The next scales of interest are the largest gravitationally bound systems in 
the universe, clusters of galaxies. As clusters are the rarest and most massive 
of all structures, they provide a sensitive probe of structure formation and 
the initial conditions of the universe. One such test is the number density 
of clusters (Bahcall & Fan 1998) as a function of mass and redshift, which 
depends sensitively both on the total matter in the universe as well as the 
presence of dark energy and its properties. In order to use such tests, one 
needs to have a large catalogue of clusters, with well defined selection crite- 
ria and completeness fractions. One approach is to use large photometric or 
spectroscopic surveys of galaxies (White & Kochanek 2001) to construct such 
catalogues; lensing provides us with an alternative approach (Hennawi et al. 
2001, White et al. 2001). An advantage to lensing is that it detects all massive 
structures, whether they are luminous or not, while other methods assume the 
presence of luminous matter. Since theory predicts the total number of such 
objects, lensing allows for the most direct comparison to predictions. Also, any 
possible existence of "dark clusters" (e.g. Miralles et al. 2002) would bias the 
latter methods, while lensing surveys would be unaffected. However, lensing 
has its own disadvantages that must be understood, if it is to be reliably used. 

This paper develops a method to detect clusters in lensing surveys, as well as 
addressing the theoretical limitations of lensing for cluster searches. It then 
characterises the completeness and reliability (defined below) of such surveys. 
We also examine the potential of such surveys to constrain the profile of the 
dark matter haloes associated with such clusters. This work parallels and 
extends the work in White et al. (2001), although we use different methods. 

Finally, we address the measurement of the convergence power spectrum. The 
power spectrum is one of a host of statistics that can be measured, but it 
has emerged as the statistic of choice for cosmology, since many models pre- 
dict a gaussian random field distribution on large scales, for which the power 
spectrum contains all the information. Also, the power spectrum has a natu- 
ral interpretation from linear structure formation, and therefore is a powerful 
probe of cosmology, especially in conjuction with other astrophysical measure- 
ments (Jain & Seljak 1997, Kaiser 1998, Hu & Tegmark 1999). The techniques 
for estimating the power spectrum are similar to those already employed in 
CMB and galaxy redshift surveys. Here we adapt them to weak lensing. 
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The paper is organised as follows: We start by describing the features of our 
algorithm. In §3, we review the basic formalism of weak lensing, while §4 
describes the simulations used in the paper. We then examine each of our cho- 
sen weak lensing applications, image reconstructions (§5), cluster detections 
(§6), and power spectrum estimation (§7 ). We conclude in §8. The numerical 
algorithms used in this paper are described in the Appendix. 



2 The Algorithm: Features 

As mentioned in the introduction, large weak lensing surveys have the same 
computational challenges that the next generation of CMB and galaxy survey 
analyses will face. Maximizing the weak lensing signal involves a high number 
density of background galaxies distributed over large areas of the sky, imply- 
ing Ng a i > 100,000 galaxies. All the algorithms described in this paper are 
written in terms of matrix operations^, and the computational challenge is in 
implementing these N ga i x N ga i matrix operations efficiently The goal of this 
paper is to show that the challenge is tractable. 

The actual numerical method is described in the Appendix; we discuss the 
various features of the algorithm here. 

(1) The algorithm works in real space: There are a number of reasons why a 
real space algorithm is preferable to a harmonic space approach: 

(a) The noise properties of data are typically trivially representable in 
real space. 

(b) Real data is not uniform, but involves a number of cuts that arise 
from bad data, star removal, cosmic ray subtraction etc. In real space, 
these cuts are trivially included. In harmonic space, these cuts would 
introduce artifacts due to the non uniformity of coverage, and these 
are generically difficult to correct for. 

(c) Since we do not work directly with Fourier modes there is no need 
to impose periodic boundary conditions. Thus, there are no issues 
associated with aliasing of power from the modes on the scale of the 
survey. 

(2) There is no pixelisation required: A particularly simple solution to the 
computational challenges of large data sets is to pixelise the data if pos- 
sible. There are a number of methods to do this, and one can choose 
pixels that maximize the signal given a fixed number of pixels. Pixeli- 
sation however does involve some degree of data loss at all scales, and 
no information is preserved below the pixel scale. While this may be ac- 
ceptable for image reconstruction and power spectrum measurements on 

1 Most modern cosmology algorithms are cast in this form. 
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large scales, it is undesirable for cluster searches, where small scale infor- 
mation is essential. Furthermore, pixelisation is particularly problematic 
if the survey geometry is complicated. 

(3) The time complexity is N gai log N ga i : We have tested our algorithm in the 
range N ga i = 30,000 to 180,000 using a single processor on a workstation. 
Within this range, the computational time scales (effectively) linearly 
with the number of galaxies; the processing time ranged from a few min- 
utes for the image reconstructions and halo searches, to a few hours for 
the Fisher matrix estimation. We also note that the algorithms in this 
paper, especially the power spectrum estimation, are readily parallelized. 

(4) The algorithm is generalisable : The algorithm does not use any properties 
that are unique to weak lensing; most of the methods described here have 
either been borrowed from other applications, or else are trivially gener- 
alisable. Consequently, this algorithm can be used in other applications 
where large N is a computational restriction. 



3 Lensing Formalism 

Let us begin by briefly reviewing the basic formalism of weak lensing to es- 
tablish notation and our conventions. The reader is referred to Bartelmann & 
Schneider (2001) for more details; our treatment closely parallels the discus- 
sion in Hu & White (2001). 

The gravitational deflection of light can be described as a mapping between a 
source (S) and image (I) plane. The mapping can be written as 



where <5x is the displacement vector between two points on a given plane. In 
the weak lensing regime, the mapping has the form 



where o~i are the Pauli matrices, k is the convergence, and 7 is the two com- 
ponent shear. Naively, one might expect that this mapping depends on three 
independent parameters. However, the three components of are not in- 
dependent; the relation between them most easily expressed in Fourier space 
(we will assume a small enough patch of sky in this paper to ignore curvature 
effects), 



6xf = 




(1) 




(2) 



7i (1) 



cos(20i) 



72(1) 



k(1) sin(20!) , 



(3) 



5 



where (f)\ is the direction of the 1 mode. Within the weak lensing approxima- 
tion, the expectation value of the ellipticity is proportional to the shear. The 
proportionality constant depends on the definition of the ellipticity; we adopt 



(e) = 7 . 



(4) 



We now compute the various lensing two point statistics. To do so, we Fourier 
decompose the shear field into the so-called E and B modes, 



7i (n) 



72 



'A) = J 



d 2 i 

(2k)' 



[E(l) cos(20!) - B(l) sin(20i)] e 
[EQ) cos(20i) + B(\) sin(20i)] e 



il.n 



il.fi 



(5) 



The two point statistics of these quantities are specified by their power spectra, 



(E(l)E(Y)) = (2n) 2 5(l-Y)P EE 
(B(\)B(l')} = (2tt) 2 5(\-1')P 1 bb 
(E(l)B(\')) = (2n) 2 5(\-r)P l 



EB 



(6) 



We note that weak lensing by density perturbations only produces E modes, 
while shot noise and systematic effects can produce both. In terms of power 
spectra, P t EE = P t KK , P? B = P t BB = for weak lensing, while P t EE = 
P BB , P EB = for shot noise. Systematic effects can produce all three power 
spectra. For the rest of this paper, we will ignore P EB since it is parity violat- 
ing; we refer to P EE = Pf K as the E mode power, while P BB is the B mode 
power. 



Using the above expectation values and standard trigonometric and Fourier 
identities, we calculate the various covariance matrices, 



(1) 



C77 

^(ij)(ab) 
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(2) 
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l ab 



Jo + C4J4 S4J4 
S4J4 Jq — C4J4 

Jo — C4 J4 — S4J4 
— S4J4 Jq -\- C4J4 

ldl pnnr 



(8) 
(9) 
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Parameters Value 



0.3 



0.7 



0.9 



h 



0.7 



Boxsize 239.5 Mpc/h 



Particle Mass 6.86 x 10 10 M /h 



Table 1 



The cosmological parameters used in the N-body simulations. 



with 



I a = (-2c 2 J 2 ,-2s 2 J 2 ) . 



(10) 



(3) Sg = J 




o 



(11) 



For convenience, we have introduced the shorthand notation c n = cos(n0), s n = 
sin(n0), J n = J n (W). The displacement vector between two points is de- 
scribed in polar coordinates by (0,(j>); the components of the shear are (a, b). 
We will often omit the superscripts for notational simplicity when referring to 
these matrices if they are implicit in the context. 



4 Simulations 

The simulated convergence maps used in this paper are derived from the Virgo 
N-body simulations (Jenkins et al. 1998), run at the Edinburgh Parallel Com- 
puting Center and the Computing Centre of the Max Planck Society at Garch- 
ing. These simulations use a parallel AP 3 M code (Couchman et al. 1995,Pearce 
& Couchman 1997) to evolve 256 3 particles from z=50 to z=0. The cosmo- 
logical parameters used in the simulation are in Table 1. The dark matter 
distribution is then projected onto a series of planes in the redshift range z=2 
to z=0. Dark matter haloes with a mass greater than 10 14 M are identified 
with a FOF group finder with b = 0.2, forming the halo catalogue we use in 
the rest of the paper. 

We compute the convergence map from these planes by using discrete multiple 
plane lensing (Schneider, Ehlers, & Falco 1992), 



k = 



— n M 2^9i — Ax 
2 fri a, 



% i 



(12) 
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Fig. 1. Left: A simulated 1° x 1° n map, assuming all the sources are at z=l. 
The map was created using the method described in the text. Haloes with masses 
greater than 10 14 M are marked by squares with the sizes scaled by K/£ cr jj, where 
S cr it = -§^Q jpi) and D s , Di, Di s are the angular diameter distances to the lens, 
source and between them, respectively. The scale is between -0.06 and 0.26. Right: 
A WF reconstruction, assuming a background density of n=25 galaxies/arcmin 2 
and an unrealistically small intrinsic scatter in the ellipticities of ai n t = 0.4/100. 

where 5i is the density perturbation at plane i and is the geometrical distance 
ratio, given by 

g(x , x - ) = TM)r^l, (13) 

r(x) 



with x the comoving distance and r(x) = snix(x) the usual curvature distance. 
Two sets of convergence maps are created, 3.5° x 3.5° maps assuming all the 
sources are at z=l and 2°x 2° maps using the redshift distribution, 

n(z) oc z 2 exp(— z/zq) , Zq — 0.4 , (14) 



to weight the planes in the sum. A representative 1° x 1° subsection is shown 
in Fig. 1. In order to create "independent" maps, the origin of the planes 
is randomly chosen and we restrict ourselves to using 1° x 1° subsections to 
create the galaxy catalogues. While the maps are independent on small scales, 
this breaks down at larger scales, which manifests itself in the high precision 
numerical experiments of Sec. 6. 

In order to generate galaxy catalogues, we create shear maps by Fourier trans- 
forming the K maps and using Eq. 3. In order to eliminate artifacts from the 
periodic boundary conditions of the FFT, only the central 1° x 1° of the shear 
map is used. We randomly assign galaxies positions, and bilinearly interpo- 
late the shear map and use Eq. 4 to compute their ellipticities. In addition, 



Section 



n z distribution 



Image Reconstruction 10,25,50 



yes 



Clusters 



25,50,100 



yes 



Power Spectrum 



25 



no 



Table 2 



The table summarizes the types of galaxy catalogues used in various sections of the 
paper. The background density of galaxies (per arcminute 2 ) is given by n, while the 
third column indicates whether a redshift distribution of the sources was considered 
or not. 

each galaxy is given a random ellipticity drawn from a Gaussian distribution 
with a = 0.4. For the rest of the paper, we refer to the intrinsic ellipticities 
of galaxies as "noise". In general, the noise would also contain measurement 
errors, but we do not simulate these. Note that our method simulates the 
effect of non-uniform coverage due to the clustering of background sources, 
although we limit ourselves to Poisson clustering. Table 2 summarizes the 
different catalogues used in the various sections of the paper. 



5 Image Reconstruction 

One of the basic aims of a weak lensing survey is the reconstruction of the 
2D convergence field. A particularly simple nonparametric method for this is 
Wiener Filtering (WF) (Seljak 1998). In the case that the data are gaussian 
distributed, the WF estimator coincides with the maximum posterior proba- 
bility estimator (Zaroubi et al. 1995) and is therefore, optimal. Even when the 
data are not gaussian, this estimator still minimizes the variance (as denned 
below) among all linear estimators; however, it is no longer guaranteed to be 
optimal. For LSS reconstructions, the deviations from gaussianity on large 
scales are expected to be small, and so WF is, in an appropriate sense, the 
best that one can do. 



5. 1 Theory and Implementation 

To derive the estimator, let us organise the ellipticities of N galaxies into a 2N 
- vector, e. The estimator for the convergence k, at M points can be written 
as k, = <E»e. We wish to minimize the variance, 




(15) 
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with respect to <E>. This gives 

$ = (/ce^ee*)" 1 = S K7 (,S 77 + N)' 1 



(16) 



where S are the covariance matrices of Sec. 2, and N is the noise covariance 
matrix. We note that S" 77 + N is a 2N x 2N matrix while S* 7 is an M x 2N 
matrix. 

In addition, it is essential to generate an error estimate for the reconstruction. 
We do this by generating mock catalogues created by randomizing the galaxies' 
ellipticities and reconstructing the k map for each of these catalogues. The 
variance over N MC (=200 in this paper) maps is a measure of the error of 
the reconstructed convergence field. It is important to note that in generating 
the mock catalogues, the galaxy positions are not altered, thereby explicitly 
taking into account the non-uniform (here Poisson) sampling of the data. 

Using WF requires knowing the exact power spectrum (in the covariance ma- 
trices) for it to be optimal. For this one can just use the estimate of the power 
spectrum from the data using the methods of Sec. 6 (Seljak 1998). However, 
empirical tests demonstrate that the reconstruction is relatively insensitive 
to the power spectrum used if it is roughly approximates the true one, even 
though the estimator is no longer strictly optimal. 

Finally, on the implementation of this algorithm and others in the paper; first, 
it is not necessary to estimate k at the positions of the galaxies. Indeed, it is 
more useful to estimate it on a uniform grid; all the figures in this paper are 
reconstructions on a 256 2 or 512 2 grid. Second, while the estimator is theoret- 
ically simple, implementating the required matrix operations numerically for 
10 5 galaxies and greater, is more of a challenge. There are a number of possible 
solutions to this; we refer the reader to the appendix for our implementation. 

5.2 Results 

The WF reconstructions of the field (Fig.l) for a variety of background den- 
sities of galaxies and realistic noise are in Fig. 2, while Fig. 1 shows the re- 
construction in the case where the noise has been reduced by a factor of 100. 
The reconstructions resemble smoothed versions of the original map. This can 
be understood by considering the WF operator in Fourier space assuming 
diagonal noise. In Fourier space, the covariance matrices are diagonal with 
S K ~* ~ S" 77 ~ Pf K . The WF operator then simply weights each Fourier mode 
by Pi K I [P\ K + o" 2 ) , where a 2 is the amplitude of the noise power spectrum; i.e. 
every mode is simply weighted by signal/ (signal +noise) . Considering Fig.3, we 
see that the principal effect of the WF is to act as a low pass filter, removing 
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Fig. 2. On the left a filtered version of the original map is shown, using a WF 
appropriate for 3rd panel. The right three panels show WF reconstructions of the 
k field of Fig. 1, where only those pixels where the measured signal exceeds the 
noise are shown. Bright (blue) are over densities, dark (blue) underdensities, while 
black are pixels where signal is below noise. The density of background galaxies is 
10, 25, 50 galaxies/arcmin from left to right, respectively. The intrinsic scatter in 
the ellipticities is Gi n t = 0.4. The colour scale is identical to that in Fig. 1. 




Fig. 3. The power spectra for the WF reconstructions. The continuous [red] line is 
the power spectrum of the original k map, while the straight dashed [black] line is the 
noise power spectrum assuming n=25 galaxies/arcmin 2 . The dotted [blue] line is the 
power spectrum of the reconstruction in Fig.l, while the dashed lines are the power 
spectra from Fig. 2 (starting from bottom to top, n=10,25,50 galaxies/arcmin 2 ). 

modes greater than I ~ 1000. Reducing the level of the noise increases the 
cutoff frequency of this low pass filter, as can be seen both from the images 
as well as the power spectra. 

The advantages of implementing the WF in real space are evident from the 
lack of artifacts at the edges of the reconstructions. The presence of an edge 
is irrelevant to a real space algorithm, and it correctly reconstructs structures 
at the edge of a field. This is not true for harmonic space approaches which 
are sensitive to the entire field, and genericaliy produce artifacts at edges. 
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Fig. 4. A histogram showing the fraction of haloes with ratio of Mi ens /Mt rue hom 
200 realizations. The lensing mass was computed by directly summing the k map 
upto 0.25 x [dashed/magenta], 0.5 x [dotted/blue], 1.0 x [solid/red] the virial 
radius, r v i r , and multiplying with £ cr it assuming cluster redshift is known. The halo 
finder mass is scaled assuming an isothermal profile. 



For completeness, we also considered the reconstructions of fields assuming a 
redshift distribution of sources and verified that within our implementation of 
the redshift distribution, there is no effect on the reconstruction. This will con- 
tinue to be true for more realistic implementations of the redshift distribution 
except in the limit of sparse sampling (low number of background galaxies). 
However, WF has only limited value in the sparse sampling limit as is evident 
from the panels of Fig. 2. 



6 Clusters 



We turn to the construction and characterization of halo catalogues from 
weak lensing data. Although weak lensing, being sensitive only to the lensing 
mass, would appear to be the ideal method for constructing such catalogues, 
it has its own theoretical limitations. We examine these first, then turn to 
the construction of an optimal filter to detect clusters, and test it. Finally, we 
address the issue of constraining cluster profiles from weak lensing data. 
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Me Sl /M true 



Fig. 5. The same as Fig. 4 except only 50 realizations were considered and line of 
sight contamination was removed. 

6. 1 Theoretical Estimates 

In order to understand the limitations of weak lensing halo searches, we mea- 
sure the expected signal from our catalogue of dark matter haloes, with no 
noise added. We define the signal as the ratio M est /M true , where M est is the 
lensing mass estimated from the convergence map, while M trU e is the halo 
finder mass. The lensing mass is estimated by integrating k outward from the 
halo centre upto a fixed fraction of the virial radiusQ, while the halo mass is 
determined by scaling the halo finder mass assuming an isothermal density 
profile. The results are shown in Fig. 4. The slight trend in the mass ratio 
decreasing with radius is a result of clusters being steeper than isothermal in 
the outer parts of the cluster, such as in NFW profile. 

A feature of this plot is the presence of negative masses, i.e. approximately 
5% of the haloes are undetectable. These negative masses result from the fact 
that lensing measures the line of sight integrated mass; low mass haloes can 
therefore be masked by underdensities. The converse, low mass haloes being 
masked by heavier haloes, also occurs and is responsible for the long tail of 
Fig. 4. Removing line of sight contamination (Fig. 5) by considering each 
lensing plane individually (Eq. 12) removes both the negative masses and the 
tail, verifying our interpretation. This figure shows that the lensing masses are 
systematically greater than the halo finder masses. This is the combination of 
two effects; the true profiles are not isothermal, and so using an isothermal 

2 To get the mass, one must multiply this by £ C rit- Equivalently, one could scale 
the halo finder mass by this same ratio. 
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profile to scale the halo finder masses will underestimate the mass. Also, the 
lensing masses are 2D projected masses, while the halo finder measures the 
3D distribution. 

Another related concern are filaments oriented along the line of sight that 
could masquerade as haloes. In order to determine the degree of contamina- 
tion, we consider all pixels in our k maps that exceed a particular threshold 
(we tried a few different thresholds, but our results were insensitive to the par- 
ticular choice). We then compute the distance, A9, between the pixel and the 
closest halo. This is a measure of the correlation between the k overdensities 
and the haloes. We find that all but a negligible fraction of overdensities did 
not correspond (A6 > 1 arcmin) to real haloes, implying that contamination 
due to filaments is insignificant. 



6.2 Cluster Detection: matched filters 



The problem of halo detection can be formally stated as follows - given an 
input signal with a spatial distribution /(x) and amplitude A, the measured 
signal can be written as 

/'(x)=A/(x) + 7V(x) (17) 



where iV(x) is the noise. Assuming that the noise has zero mean and and its 
statistical properties are spatially independent, the minimum variance estima- 
tor, $, has the form (in Fourier space) (Haehnelt & Tegmark 1996), 

m - ■ us) 



Specializing to the case of lensing, the estimator is given by 

C/K-y 

* « — • (19) 



The normalization is usually determined by requiring that it be unbiased. 
However, since we are interested in the ratio of the signal to the noise, the 
normalization cancels and we leave it arbitrary. The signal matrix, S' K1 , has 
the same form as the unprimed matrix of equation 9, except that the input 
power spectrum is replaced by the Fourier transform of the halo profile. 

We now describe our halo detection algorithm. We start by convolving the 
shear map with the matched filter; in practice, this involves multiplying the 
data (organised into a vector) by the matrix in Eq. 19. Let us denote this 
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Fig. 6. Cluster detections for a variety of scale lengths and galaxy densities, using a 
threshold of a = 2. The darkness is proportional to the significance of the detection, 
saturating for a > 5. Haloes with masses over 10 14 M Q are marked by the boxes with 
the size of the box dependent on the expected signal. The columns are different 
background galaxy densities, n=25,50,100 galaxies/arcmin 2 from left to right. The 
rows are different scale lengths, 6 S = 5.0,1.0,0.5 and 0.1 arcminutes from top to 
bottom. The realizations here did not include a redshift distribution. 

convolved map by 7(x) - we must now determine whether the value of 7 at a 
given point determines a cluster or not. In order to do this, we start with the 
null hypothesis, i.e. there is no halo at x and ask whether 7(x) is consistent 
with this. Under the null hypothesis, the expectation value of 7(x) is zero 
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Fig. 7. Same as Fig. 6 except that a redshift distribution was used in the realizations. 

with variance a 2 . We define a cluster detection if 7(x) > na, where n is some 
chosen threshold. We compute a as in the previous section, by randomising 
the galaxy ellipticities while keeping their positions fixed to create 200 noise 
maps, and measuring the variance of the resulting convolved maps. 

We must emphasize a number of points at this stage. The first is that for any 
given threshold, we expect to have points not associated with a halo to have 
7(x) > na; this fraction of false detections will decrease with increasing n. If 
the noise is Gaussian, then the fraction is known analytically; however, as we 
will see in the following sections, the noise is not consistent with being Gaus- 
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sian. Indeed, the principal sources of noise come from extraneous structures 
not associated with the halo; these structures do not constitute a Gaussian 
random field. It is however possible to calibrate the expected false fraction 
from simulations and include it in theoretical analyses. We also re-emphasize 
that when computing the variance, it is important to leave the galaxy positions 
unchanged and only randomise the ellipticities. Only by leaving the galaxies' 
positions unchanged will the signal, which is dependent on the distribution of 
background galaxies, be properly estimated. 

The cluster profile is still a free parameter. A particularly simple choice is the 
singular isothermal sphere, which in projection is oc This choice has the 
disadvantage of excessively weighting the outer parts of the halo (the inte- 
grated profile is logarithmically divergent), which are both noise dominated 
and contaminated by external structures. Numerical experiments with it also 
verify that it is suboptimal. 

A simulation motivated choice is the NFW profile in projection (Navarro, 
Frenk, & White 1997, Bartelmann 1996). However, the NFW profile has a r" 1 
inner cusp, which is not resolved by our simulations. Considering the NFW 
profile, convolved with the simulation pixels, suggests a profile of the form 
(White & Kochanek 2001), 



where 9 measures angular seperation on the sky. This profile has the same 
aysmptotic 6~ 2 behaviour of the NFW profile; within the scale radius 9 S , it 
possesses a core. Note that we have introduced an angular scale into the filter, 
and that there is no way to theoretically choose this scale for all clusters, 
since they will be at different distances. When analyzing real data one may 
of course choose a projected NFW profile instead, which however still has a 
physical scale that cannot correspond to a single angular scale. We resolve this 
problem by simply using a series of scale radii to perform the reconstructions. 

The results for 9 S = 5.0, 1.0, 0.5 and 0.1 arcminutes are shown in Figs. 6 and 
7. As is evident from the figures, the scale radius corresponds to a smoothing 
scale. A larger scaling length results in fewer false detections, but tends to 
coalesce seperate haloes into single structures and misses a greater fraction of 
clusters. Decreasing the scale radius resolves seperate structures better, but is 
also more susceptible to noise and extraneous structures, as can be seen from 
the number of detected structures that do not correspond to haloes. 
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Fig. 8. The completeness fraction for different thresholds and scale lengths, com- 
puted from 200 realizations, with a background density of n = 25 galaxies/arcmin 2 . 
The panels are labelled by the scale length in arcminutes. The solid [black], dotted 
[red] and dashed [blue] lines represent thresholds of a = 2, 3 and 4 respectively. The 
bin size is 0.05 r„j r , implying the central completeness (r < 0.1r\,j r ) is contained in 
the first two bins. 
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Fig. 9. Same as Fig. 8 except that a redshift distribution was used in the realizations. 

6.3 Completeness/ Reliability 

It is useful to recall at this stage that we have two lists of data, a list of 
haloes from the halo finder and a list of pixels whose halo signal was above a 
certain threshold. There are two obvious questions one can ask - what fraction 
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of haloes have associated pixels, and what fraction of pixels have associated 
haloes. We define the completeness as the fraction of haloes that have their 
closest detected pixel a specified distance away from them. Conversely, the 
reliability is defined as the fraction of pixels that have their closest halo a 
specified distance away. We note that the natural measure of distance for 
completeness is in units of the virial radius, whereas the reliability distance is 
measured clS db physical angular distance. 

The completeness fractions, for three different thresholds, are shown in Figs. 
8 and 9. We make the following observations about these results: 

(1) As expected, increasing the threshold reduces the completeness fraction. 
Note however that this happens at 2-3cr level, so one cannot choose a 
high level of significance (thereby rejecting spurious detections with high 
confidence) and have a high level of completeness at the same time. 

(2) The width of the completeness distribution is also seen to widen as the 
scale length decreases. This can be understood heuristically by noting 
that as the maps become more noise dominated, the point closest to the 
halo may be unrelated to it, thereby broadening the distribution. A crude 
modelling of the distribution in the limit of purely random, independently 
distributed points obtains, 

f(x) oc x (1 - x 2 ) n ; x = r/r vir , (21) 

which, for n ~ 5 resembles the distribution for the smallest scale ra- 
diusQ. Note however that this is meant to be illustrative, and not a close 
approximation to the actual distribution. 

(3) A redshift distribution of the sources decreases the completeness fraction. 
This is the result of two competing effects - using a redshift distribution 
allows one to probe a larger volume of space, and one would hope to detect 
more clusters in that region. On the other hand, the redshift distribution 
results in a more non uniform signal which reduces the signal. 

The corresponding reliability fractions are shown in Figs. 10 and 11, and the 
fraction of detected pixels that had no corresponding halo within 5 arcminutes 
is in Table 3. Note that we have only shown the fractions for the simulations 
without redshift distributions; including a redshift distribution produces sim- 
ilar results. We note the following: 

(1) The reliability fraction becomes more peaked as the threshold increases. 

(2) The above trend becomes more pronounced as the scale radius decreases, 
implying that a higher threshold is necessary to eliminate noise at these 
scales. However, the number of detected pixels at both high thresholds 
and small scales also steeply decreases, again pointing to intermediate 

3 The distribution assumed n points distributed within an area of 7rr^ r . 
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Fig. 10. The reliability fraction for different thresholds and scale lengths. The panels 
are labelled by the scale length in arcminutes. The solid [black], dotted [red] and 
dashed [blue] lines represent thresholds of a = 2, 3 and 4 respectively. 




Fig. 11. Same as Fig. 10 except that a redshift distribution was used in the real- 
izations. 

scales for optimal cluster detection. 

(3) Unlike the completeness, the reliability appears to be insensitive to a 
redshift distribution. 

(4) Finally, the dropoff near the origin is a purely geometrical effect, due to 
the fact that the differential area grows linearly with radius. 

It is also useful to consider completeness as a function of halo mass, we do 
this in Figs. 12 and 13. The completeness plotted is an integrated quantity - 
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Table 3 

The fraction of detected pixels that did not have a corresponding halo within 5 
arcminutes, as a function of the scale radius, 9 S and the threshold, a. For simplicity, 
we have only shown the figures for simulations without a redshift distribution; the 
case with a redshift distribution is similar. 
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Fig. 12. The completeness fraction (integrated to 0.2 function of halo mass. 

The panels are labelled by the scale length in arcminutes. The solid [black], dotted 
[red] and dashed [blue] lines represent thresholds of a = 2, 3 and 4 respectively. 

the fraction of haloes that have a detected pixel within 0.2 r„ r of them. The 
completeness is a steep function of halo mass, with the surveys being nearly 
100% complete at the high mass end, but less than 5% complete for the lowest 
masses. The noisy nature of these plots is simply an artifact of the fact that 
our simulations have few massive haloes. 

An important observation to make is that the achieved completeness is signif- 
icantly lower than what one might have expected theoretically. This discrep- 
ancy can be traced back to the assumption of a uniform signal in our theo- 
retical estimates. The Poisson clustering of the background galaxies does not 
satisfy this assumption; indeed, configurations of background galaxies make 
certain haloes undetectable. This would be even worse if one included cluster- 
ing of the background galaxies, and must be taken into account when com- 
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Fig. 13. Same as Fig. 12 except that a redshift distribution was used in the real- 
izations. 




Fig. 14. The integrated k profile created by stacking clusters in three mass bins, 
1-3 x [dashed/blue], 3-6 x [dotted/red], > 6 x [dot-dashed/green] 10 14 M . The 
profile is arbitrarily normalized to unity at the virial radius. The solid [black] line 
is arbitrarily normalized c=4 projected NFW profile shown for comparison. 



puting predicted cluster finding efficiency as a function of halo mass. 
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Fig. 15. Same as Fig. 14 except for the tangential shear profile. 
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Fig. 16. The tangential profile for the profile in Eq. 20 with 9 S = 5.0 [solid/black] , 1.0 
[red/dotted], 0.5 [blue/dashed] and 0.1 arcminutes [dot-dashed/green]. The normal- 
isation has been adjusted to agree with Fig. 15, and the virial radius of the cluster 
has angular size 2.5 arcminutes. 

6.4 Cluster Profiles 

There has been considerable interest recently (Sheldon et al. 2001) in using 
weak lensing to statistically constrain cluster halo profiles. We examine this 
by considering both the integrated mass profile (Fig. 14) and the tangential 
shear profile (Fig. 15). The integrated mass profile is simply given by summing 
our convergence map with no noise added. For the shear profile, we use our 
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simulated catalogues. The ellipticity of each galaxy is decomposed, relative 
to the halo center, into a tangential and radial component, which we then 
average in radial bins to get the profile. Unlike the integrated mass profiles, 
the shear profiles are estimated in the presence of noise. 

We divide clusters into three mass bins, and average the profiles within those 
bins assuming we know their central positions. This simulates the process of 
"stacking" clusters with centers known from X-ray or optical data. The profiles 
are seen to be well constrained for the highest mass bin and poorly constrained 
at low masses. 



We can also compare the measured shear profiles to what one would expect 
if one assumed a convergence profile of the form in Eq. 20. The expected 
profiles for the four scale radii we use are shown in Fig. 16. As we might 
have anticipated from the results of the two previous subsections, the best 
agreement is when 9 S is between 0.5 and 1.0 arcminutes. It is also important 
to emphasise that the dropoff seen in Fig. 15 is an effect of the pixelisation of 
the map; a similar effect is seen in Fig. 16 where the core radius mimics the 
effects of pixelisation. If clusters had significant cores, then just such an effect 
would be also physically expected. 



7 Power Spectrum Estimation 



In this section we consider the measurement of the convergence power spec- 
trum from weak lensing data. The subject of the optimal measurement of the 
power spectrum from noisy data has received a lot of attention, with regards 
to the CMB and galaxy redshift surveys, and we will simply import the tech- 
niques that have been developed and validated there to weak lensing. It should 
be emphasized that weak lensing, unlike galaxy redshift surveys, measures the 
matter power spectrum directly, eliminating the complications of bias. 

Since the formalism for optimal weak lensing power spectrum measurement 
has been discussed in detail elsewhere (see for eg. Seljak 1998, Tegmark et al. 
1998), we will limit ourselves to a brief discussion (following the notation in 
Padmanabhan et al. 2001) to establish the formalism we will be using. A 
similar discussion that uses a different, although related, approach is in Hu & 
White (2001). 
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7.1 Theory 



Let us parametrize the power spectrum by Np step functions such that P(l) = 
Pi for < I < k, where i ranges from 1 to N p . We can now arrange the 
Pi into an N p -vector, p. The problem now reduces to estimating p from the 
data x, where x is a 2N vector consisting of the galaxy ellipticities. Define the 
covariance matrix of the data, C = S" n + N where S" 77 is the signal covariance 
matrix (see Sec. 2) while N is the noise matrix. Recall that iV = N meas + N int 
where N meas is the measurement noise, while N int is the intrinsic noise due 
to galaxy ellipticities. This intrinsic noise is not known a priori and must be 
estimated from the data, an issue we address at the end of this section. 

Since the power spectrum is the sum of step functions, we rewrite the covari- 
ance matrix, 

N p Np 

C = '£p i C i + N = '£C i , (22) 

i=l i=0 



where Cj = dS 11 jdpi for 1 < % < N p , and we have defined Co = N and 
introduced a dummy parameter, p = 1 for notational convenience. 

We now form the minimum variance quadratic estimators (Hamilton 1997a, 
Hamilton 1997b, Tegmark 1997), 

q t = ^x'C-^C^x (23) 



and group them into an N p +1 vector, q. The quadratic estimators have the 
following properties, 



(q) = 

(q^)-(q)(q) t = ^ 



(24) 



where F is the Fisher information matrix (Tegmark et al. 1997 and refs. 
therein) , 



-tr 



(25) 



The best estimate of the power spectrum can be written as 

p = M(q-b) (26) 
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M 



Properties 



(J2j=i Fij)^ 1 $ij minimum variance, 

correlated errorbars 
anti-correlated errorbars, 
delta function windows 
[£^i(^ 1/2 )u] -1 (-F _1/2 )ij unC orrelated errorbars 
Table 4 ~~ 

The various choices for M (Eq. 26) and their properties. 

where b = F i0 , and we restrict all indices to run from 1 to N p . The matrix M 
is an arbitrary N p x N p matrix with the property that the rows of MF sum 
to unity and is chosen such that our power spectrum estimates have certain 
desirable properties. The choices for M that we consider and their properties 
are listed in Table 4. Note that there is no gain in information in this final 
step; the information content is the same as what is in the quadratic estimators 
and the entire Fisher matrix. However, if one only presents the diagonal of the 
Fisher matrix (eg. errorbars in a plot), then it is important to ensure that the 
errorbars are uncorrelated (the third choice). Our estimates p now have the 
following properties, 

(p) =Wp = MFp 
(pp*) — (p)(p)* — MFM* , (27) 

where we note that the rows of W have the obvious interpretation as window 
functions. 

7.1.1 Choosing a prior 

In the preceding discussion, we have implicitly assumed a prior power spec- 
trum, and all the properties have rested on this prior being the true power 
spectrum. This begs the question - how does one choose the prior? Any rea- 
sonable guess to the power spectrum works as an initial guess. The power 
spectrum estimated then can be used as the prior, and the process can be 
iterated. Several authors (Bunn 1995, Padmanabhan et al. 2001 ) have explic- 
itly verified that the choice of prior does not bias the result, and therefore, in 
practice, only one or two iterations are required. 

7.1.2 Measuring the intrinsic ellipticity scatter 

The intrinsic scatter in the ellipticities can be estimated by simply computing 
the r.m.s. value of the ellipticity of the data. This will produce the correct 
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Fig. 17. The E mode power spectrum, averaged over 200 Gaussian realizations. 
The [red] triangles show the input power spectrum, while the heavily shaded [black] 
regions are the same power spectrum convolved through the window functions. The 
width of the regions are the errors predicted by the Fisher matrix. The lightly shaded 
[green] regions are the estimated power spectrum, with the widths representing the 
errors calculated from the realizations. 

answer so long as the shear correlation length is smaller than the size of the 
field considered. However, if the scatter is underestimated, then the extra 
power will manifest itself as excess power in the E and B modes. In order 
to correct for this, we follow Hu & White (2001) and introduce a shot noise 
parameter, p no i se , to the power spectrum, whose contribution to the power 
spectrum is C no i se = Pnoise^ij- As with the power spectrum, the excess power 
measured this way can be corrected in the next iteration. 



7.2 Results 

The Fisher matrix formalism is only exact for Gaussian random fields, and 
so, in order to test it, we simulate 200 gaussian random convergence fields 
with a known E mode power spectrum, and no B mode power. Figs. 17 and 
18 summarize the results. The triangles are the input power spectrum, while 
the heavily shaded boxes are the input power spectrum convolved with the 
window functions. The width of the boxes are given by the Fisher matrix 
error estimates. The lightly shaded boxes are the measured power spectrum, 
averaged over 200 realizations, with the width of the box given by the standard 
deviation over the realizations. If the method is correct, then both the power 
spectrum and the errors should agree. The agreement is best for the E mode 
power, where the input power was non-zero. We note that although the input 
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Fig. 18. The same as Fig. 17 except for the B mode power. Note that there was no 
input power; however, there is power expected in the leftmost band due to leakage 
of power through the window function. 



100 1000 10 4 

1 



Fig. 19. Same as Fig. 17 except that the 200 realizations were drawn from N-body 
simulations instead. 

B mode power was zero, we expect to see power measured due to leakage from 
the E modes through the window functions. This is significant only for the 
leftmost band, that probes scales larger than the surveyQ We observe that 
the remaining B mode power is consistent with zero, as expected. 



Figs. 19 and 20 show the expected and measured power for 200 realizations 



The modes were included to prevent aliasing of power. 
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Fig. 20. Same as Fig. 18 except that the 200 realizations were drawn from N-body 
simulations instead. 



drawn from N-body simulations, which allow us to test whether the breakdown 
of the gaussian approximation biases our results. We find that the Fisher 
matrix underestimates the true errors in the power spectrum. A similar effect 
was found by Hu & White (2001). Although this is not a large effect, N-body 
simulations must be used to calibrate the Fisher matrix errors when using the 
power spectrum estimates to extract cosmological information. The second 
observation is the discrepancy between the expected and measured E mode 
power in the leftmost bands. This is due to the fact that all our realizations 
were derived from the same N-body simulations, and although we expect the 
realizations to be independent on small scales, this will break down at large 
scales. Fig. 17 verifies this expectation. Note that these bands probe scales 
larger than those simulated; it is interesting that we correctly estimate the 
power to within a factor of two in these bins. 



All of the above results use the third choice of Table 4 for M. Figs. 21 and 22 
contrast the first and third choices for M. As expected, the errorbars for the 
first are smaller, but are correlated, as compared to the third. The effects of 
decorrelating are pronounced for the B mode power; the points are randomly 
distributed above and below zero, while in Fig. 21, they are clearly correlated. 
We therefore reemphasize the importance of decorrelating points when visually 
presenting data. We do not present the results for the second choice since the 
errorbars are too large to be useful in this case. This is a direct result of the 
fact that the window functions in this case are delta functions. 
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Fig. 21. The measured power spectrum from a single N-body realization using 
choice 1 of Table 4. The solid lines are the expected power, while the boxes are 
the expected power with Fisher errors. The points are the deviation of the B mode 
power from zero, shown by the dashed line. 




Fig. 22. Same as Fig. 21 except that it uses choice 3 of Table 4. 
8 Discussion 



Weak lensing is emerging as an important tool in cosmology. One of its prin- 
cipal advantages is that it probes the matter distribution directly, making 
no assumptions about the dynamical state of the matter. This is desirable 
both because it eliminates complications of interpretation, but also because 
it gives us an opportunity to study the physical processes underlying those 
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assumptions. 

In this paper, we considered three applications of large weak lensing surveys. 
We summarise the results here, comparing it to previous work. 

8.1 Image Reconstruction 

The simplest goal of a weak lensing survey is to produce a map of the dis- 
tribution of matter in the universe. Wiener filtering provides a simple, and 
almost optimal reconstruction of the matter distribution, and is our method 
of choice. An important characteristic of Wiener filtering is that it suppresses 
power on scales with low signal to noise. For the weak lensing maps we con- 
sider here, this scale is at I ~ 1000, while there is no power at scales larger 
than / ~ 360, the scale of the survey. This suggests the obvious design strat- 
egy of large area surveys to probe large scales, with high background number 
densities n > 25 to resolve features on small scales. However, Wiener filtering 
has limited cosmological applicability because of its inability to resolve smaller 
structures. 

8.2 Cluster Detection 

Clusters, as the most extreme structures in the universe, are a sensitive probe 
of cosmology. Weak lensing has the advantage that it searches for clusters 
directly as mass enhancements, independent of the presence of luminous mat- 
ter. However, in order to compare with theoretical predictions, one must, in 
addition to compiling a catalogue of clusters, understand the selection criteria 
and the completeness of the catalogue. 

A disadvantage of weak lensing is that it measures the projected mass distri- 
bution, and is therefore susceptible to contamination from uncorrelated haloes 
in the line of sight. This is worst for the low mass haloes, since they can be ob- 
scured by either more massive haloes or underdensities. More massive haloes 
are less sensitive to this effect and the theoretical maximum completeness 
approaches 100%. 

The philosophy that we adopt in this paper is that weak lensing cluster 
searches will be done in conjunction with other measurements such as galaxy, 
X-ray or Sunyaev-Zel'dovich surveys. One could either imagine using weak 
lensing to identify candidate clusters and verifying them with follow up mea- 
surements, or correlating different measurements to remove false detections. 
With this in mind, we develop an optimal filter approach to detecting clus- 
ters. Detections are defined as those regions where the measured signal to 
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noise ratio exceeds a certain threshold. We then compute the completeness 
(the number of haloes with a corresponding detection) and the reliability (the 
number of detections with a corresponding halo). We observe that complete- 
ness is strongly dependent on the mass of the cluster; for cluster masses > 6 
x 10 14 M , our cluster samples were virtually 100% complete independent of 
the threshold used, implying that one could construct a uncontaminated, yet 
complete sample, by choosing a high enough threshold. At lower masses, the 
completeness drops drastically as the threshold in increased. At low thresholds 
however, the false detection rate rises to ~ 25%. This is a direct result of the 
fact that our estimator is sensitive to spurious structure masquerading as a 
halo at low thresholds. We should however note that even for low thresholds, 
the completeness for low mass haloes is less than 40%. 

The optimal filter defines a scale length that physically corresponds to a 
smoothing scale. Excessive smoothing reduces the number of false detections, 
since the noise is reduced but it also smooths away or coalesces small sclae 
structures. As we reduce the smoothing scale, we start to resolve smaller struc- 
tures but are more susceptible to spurious detections from extraneous struc- 
tures that don't belong to any halo. The optimal scale length is ~ 1'. However, 
some caution should be exercised when interpreting this value since it may be 
affected by pixelisation in the N-body maps. 

This work parallels a similar study by White et al. (2001), although we use 
different methods to detect clusters. White et al. (2001) use both a simple 
Gaussian smoothing with a smoothing scale of 1' - 2', and aperture mass 
measures (Schneider 1996) with a scale of l'-5' and a signal to noise threshold 
varying from S > 1 to S > 5. As with our methods, the completeness drops 
with increasing scale radius, with the maximum at ~ 1', although White et al. 
(2001) do not consider smaller scale lengths. In addition, they conclude that 
their catalogues are complete for masses > 5 x 10 14 M , consistent with our 
results. 

An extension of this is the inclusion of a redshift distribution of the background 
sources. Including this distribution does not qualitatively change any of the 
trends one observes for, but reduces the overall completeness. The halo cata- 
logues, constructed assuming a redshift distribution, are complete for masses 
> 8 x 10 14 M Q . This is what one might naively expect - the most massive 
haloes are unaffected by the redshift distribution, while the signal from the 
less massive haloes is reduced and a larger fraction of them are missed. The 
reliability is however unaffected by a redshift distribution, because false de- 
tections are due to noise and extraneous structure, and one would not expect 
those to be affected by a redshift distribution. 

There are two cautionary lessons that we draw from this work. The first is 
that the optimal nature of matched filters is easily affected by deviations from 
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ideality. These are caused here by deviations from circular symmetry due to 
extended haloes, as well as internal and external substructure. The second is 
that the completeness of cluster surveys are strongly affected by clustering 
of the background galaxies. Certain haloes are rendered undetectable by the 
Poisson clustering of our simulated background galaxies. Real background 
galaxies are not Poisson distributed, but are correlated, and we expect this 
to have a worse effect. Therefore, all analyses must take into account the 
clustering of source galaxies to get a representative measure of the expected 
completeness of the survey. 

We note that we have restricted ourselves here to consider the information 
from lensing only. However, lensing surveys will produce high resolution multi- 
colour images of the survey region. One might imagine using both the lensing 
and imaging data simultaneously to detect clusters; such an approach would 
naturally reduce the number of false detections. Another advantage to us- 
ing the imaging data jointly with the lensing is that each complements the 
other - lensing would identify haloes with no optical counterpart, while optical 
searches could detect objects with low lensing signals. 



8.3 Power Spectrum 

The convergence power spectrum is a measure of the clustering of the recent 
universe, a regime that was accessible until now only to redshift surveys. In 
contrast with the galaxy power spectrum, the convergence power spectrum is 
not affected by the complications of biasing, and provides a direct measure of 
the matter power spectrum. 

The formalism for estimating the power spectrum, borrowed from the CMB 
and galaxy surveys, assumes that the convergence field is Gaussian distributed. 
While this is not true for the true convergence field on small scales, it is ap- 
proximately true on the scales that lensing probes. We have explicitly shown, 
using N-body simulations, that the Gaussian approximation does not bias the 
estimators. However, the Fisher matrix underestimates the error bars and al- 
though this is not a large effect, it means that the error bars must be calibrated 
against N-body simulations. Our results are similar to those obtained by Hu 
& White (2001) although our methods differ in details. The main difference 
is that our method does not pixelise and so extracts all the information from 
the data. 

In concluding, we observe that the methods presented here are not unique 
in their application to weak lensing. We have used methods developed for 
analysis of other data sets and adapted them to weak lensing. New numerical 
solutions presented here may be adapted to other similar problems in cosmol- 
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ogy, particularly those where brute force evaluations are prohibitely expensive. 
In our application we have managed to reduce an instrinsic 0(N 3 ) numerical 
problem to O(NlogN). The same methods can be used in other applications, 
such as the analysis of CMB temperature and polarization anisotropies and 
galaxy surveys. We expect that both lensing and other areas of cosmology will 
benefit from the growing synergy in the field. 
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A Numerical Implementations 



An important feature of all the algorithms presented in this paper is that 
they are explicitly written as linear algebra operations. The basic building 
block of any implementation is therefore a routine to perform matrix-vector 
multiplication. Unfortunately, the dimensionality of the vector space, N, is 
given by the number of data points. A naive application of the algorithms 
yields iV ~ N ga i 100, 000 for which straight matrix-vector multiplication, 
an N 2 process, becomes computationally impractical. 

One can approach this problem in two ways, either by reducing N, or by 
using properties of the matrices that appear to speed up the vector multipli- 
cations. The former approach is equivalent to pixelising the data and a number 
of pixelisation schemes have been suggested, ranging from direct binning to 
"optimal" Karhunen-Loeve pixelisations of the data. 

We propose and implement an approach that is based on the latter approach. 
We start by observing that all the matrix operations that we require are 
of the form Cx or C _1 x where C is a correlation matrix of §3. The latter 
operations can be recast as direct matrix-vector multiplications by performing 
the matrix inversion iteratively (we specify the exact algorithm below). We 
then observe that the multiplication Cx is simply a convolution of the data 
by the appropriate correlation function, (, 

(Cx); = C H x i = 2 C(r; - Tj)xj , (A.l) 

3 j 

where the last identity follows from the fact that the correlation function only 
depends on the seperation between the two points. The problem is now explic- 
itly translationally invariant and one can readily apply the Fourier convolu- 
tion theorem to perform the vector multiplications efficiently The asymptotic 
scaling is now O(iVlogiV) instead of 0(N 2 ), making iV ~ 10 5 tractable on a 
workstation. 

The careful reader will no doubt point out that the use of Fourier methods 
requires the data to be uniformly sampled, which our data is not. We solve 
this problem by resampling the data onto a grid of N gr id ~ 4iV 9a j points where 
the additional factor of 4 comes from the need to zero pad in 2 dimensions. 
The exact scaling of the algorithm is therefore N grid log N grid . We emphasise 
that although an auxiliary grid is used, this grid is only an intermediate step 
which does not impose periodic boundary conditions, and whose discreteness 
effects can be compensated using multigrid and direct summation methods 
as described below. Our approach is thus conceptually very different from 
pixelisation approaches. It is also useful at this stage to point out the obvious 
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analogy between our approach and gravity solvers of PM (particle - mesh) 
N-body simulations. 

While the above approach works well for large scale modes, the pixelisation in- 
troduces inaccuracies at scales comparable to the pixel resolution. Our analogy 
with N-body simulations come to the rescue here; PM simulations have similar 
inaccuracies on small scales that can be corrected by introducing a direct sum- 
mation between pairs of particles at small seperations (P 3 M - particle-particle 
particle- mesh simulations). We start by splitting the convolution kernel into 
short and long range pieces, 

C = Clang + (short = + g(r)£ , (A.2) 



where / and g are filter functions with the properties that 



9 = 


1 (t <! Tmin) 


f = 


1 (t > T max ) 


f = 


(r < r min ) 


9 = 


(r > r max ) 





(A.3) 



with the definitions of long and short range are determined by r min and r max . 
The multiplications are now done in two stages; the first is to do the long 
range piece by the Fourier method described above, while the short range 
correlations are done by direct summation. Possible filter functions have the 
form 1 — cos 2 (#) in the regime r m j„ < r < r max , where 9 is an appropriately 
scaled length. This form is chosen to minimise the inaccuracies that result 
from the truncation of the correlation function. 



A second scheme to improve the resolution is to use a multi-grid approach. The 
single-grid scheme described above is only the simplest such implementation. 
One can trivially generalise it to multiple scales by introducing a series of 
filters, fi. In such implementations, the direct summation is performed only 
for the shortest range; all other convolutions are done by the Fourier method 
with the grid becoming coarser for larger scales. Our codes used 3 scales - the 
innermost scale for direct summation, an intermediate scale with twice the 
resolution, and a coarse grid for the largest scales. 

Note that while the O(NlogN) scaling is dependent on a flat space as- 
sumption, it can be generalized to the sphere with no significant loss of ef- 
ficiency. This is because on a sphere only the coarse grid needs to be done 
with the slower 0(N^ TSC ) scaling, while the subsequent grids can still be 
0(N g[id log A/grid) as long as the flat sky approximation holds on these grids. 
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A.l Matrix Inversions 



The problem we focus on is of the form 

y = Cx = (S + iV)x , (A.4) 

where y is known, S is the signal matrix and N is the noise. We will assume 
that N is diagonal in real space here, although the algorithm can be modi- 
fied for the case of sparse N. Also, we assume that multiplication by S can 
be efficiently performed by the methods above. The matrix inversion is then 
performed by an underrelaxed Jacobi iteration, 

x n+l = x n + Qjj + ^-1 [y - (5 + iV)x n ] , (A.5) 



where / is the identity matrix, and S, the underrelaxation parameter, is equal 
to e max , the maximum eigenvalue of S. This can be estimated using the iter- 
ative scheme, 

x «+l = ox 
II x ™ II 

&max || x || i (A-6) 

which can intuitively be verified by considering the case of S diagonal. 

The underrelaxed Jacobi iteration is only one of a number of possible ap- 
proaches to computing the matrix inverses. For our matrices, we obtained a 
fractional precision of 10~ 4 in ~ 100 iterations and were therefore not limited 
by it. However, for ill-conditioned matrices, it might be necessary to go to 
conjugate gradient or multigrid methods. 



A. 2 Trace estimation 



The only operation that cannot be trivially written in terms of matrix-vector 
operations is the computation of the Fisher matrix (Eq. 25) which involves the 
trace of the product of four matrices. This is an intrinsically 0(N 3 ) process. 
If we assume a random ensemble of vectors, x, with the property that, 

(xx*) = I, (A.7) 

we can use the following statistical identity to estimate the trace, 

tr(A) = tr(AI) = tr(A(xx*)) = xAx* . (A.8) 
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The estimator is completely determined by specifying the ensemble x. Note 
that if we choose an ensemble of dim(x) orthogonal vectors, then we can ex- 
actly recover the trace; however, taking the trace would then be an 0(N 2 log N) 
operation. This is already a gain from 0(N 3 ) and is achieved by the fast 
0(N log N) convolution methods of Cx and C _1 x operations. However, 0(N 2 log N) 
is still slow compared to other operations. One might attempt to modify this 
by choosing a smaller subset of random, but orthogonal, vectors; we how- 
ever find that this is slow to converge to the correct value. We obtain best 
convergence by using the real stochastic Z 2 esimator, where x is a random 
vector consisting of l's and -l's. For iV = 90,000, we measure the trace with 
a fractional precision of ~ 10~ 5 with an ensemble of 400 random vectors. This 
means that we have reduced the scaling from 0(N 3 ) to 0(N log N) only. 
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